Although it is well understood that selection shapes the polymorphism pattern in Drosophila, signatures of classic selective
sweeps are scarce. Here, we focus on Drosophila mauritiana, an island endemic, which is closely related to Drosophila
melanogaster. Based on a new, annotated genome sequence, we characterized the genome-wide polymorphism by sequencing
pooled individuals (Pool-seq). We show that the interplay between selection and recombination results in a genome-wide
polymorphism pattern characteristic for D. mauritiana. Two large genomic regions (>500 kb) showed the signature of almost 
complete selective sweeps. We propose that the absence of population structure and limited geographic distribution could 
explain why such pronounced sweep patterns are restricted to D. mauritiana. Further evidence for strong adaptive evolution 
was detected for several nucleoporin genes, some of which were not previously identified as genes involved in genomic 
conflict. Since this adaptive evolution is continuing after the split of D. mauritiana and Drosophila simulans, we conclude that 
genomic conflict is not restricted to short episodes, but rather an ongoing process in Drosophila. 



[Supplemental material is available for this article.] 

Intragenomic conflict describes the phenomenon that within an 
organism some genetic elements (e.g., segregation distorters) in- 
crease their transmission at the expense of others (Werren 2011). 
Due to the preferential transmission, such elements spread in the 
population and can leave a characteristic trace of strongly reduced 
variability in the genome that resembles a selective sweep (Derome 
et al. 2004). Population genetic analyses of segregation distortion 
systems in Drosophila did not find a molecular signature similar to 
a classic selective sweep (Derome et al. 2004, 2008; Presgraves et al. 
2009; Kingan et al. 2010; Bastide et al. 2011). The patterns of var- 
iability instead resembled partial selective sweeps, suggesting that 
the genetic element increased in frequency but did not reach fix- 
ation. This observation is consistent with the fact that elements of 
intragenomic conflict are frequently deleterious when homozy- 
gous (Wallace 1948; Curtsinger and Feldman 1980) or that sup- 
pressors of the intragenomic conflict have evolved (Hamilton 
1967). 

In the Drosophila melanogaster complex, only a small number 
of genes involved in intragenomic conflict have been identified 
within natural populations (e.g., Sandler et al. 1959; Mercot et al. 
1995). While this may suggest that intragenomic conflict is a rela- 
tively rare event, it needs to be considered that there is a strong 
ascertainment bias: The rapid spread of driver alleles is either pre- 
vented by a quick fixation of suppressor alleles, or, in case of sex 
chromosome-linked segregation distorters, populations with an 
advanced intragenomic conflict become extinct (Gershenson 
1928; Hamilton 1967; Lyttle 1977). In both cases, past episodes of 
genomic conflict cannot be recognized in an intraspecific poly- 
morphism analysis. 

Indeed, consistent with the idea that genomic conflict is 
a common phenomenon, detailed analysis of hybrids showed that 
"speciation" genes tend to be involved in intragenomic conflict,
but their effect could only be detected in hybrids (Perez et al. 1993;
Dermitzakis et al. 2000; Tao et al. 2001; Presgraves et al. 2003; 
Phadnis and Orr 2009; Tang and Presgraves 2009). 

Several genes involved in intragenomic conflict in Drosophila 
were discovered in the Drosophila simulans clade that consists of 
three recently diverged species, the cosmopolitan D. simulans and 
the island endemics Drosophila mauritiana and Drosophila sechellia. 
D. mauritiana was the first species for which a "speciation" gene 
could be characterized at the molecular level: In hybrid crosses 
with D. simulans, the Odysseus (OdsH) allele of D. mauritiana, together
with additional tightly linked factors, causes hybrid male
sterility in the F1 generation (Perez and Wu 1995; Ting et al. 1998)
and was later identified as a gene involved in genomic conflict
(Bayes and Malik 2009). Another D. mauritiana gene, too much yin
(tmy), causes both hybrid male sterility and segregation distortion
in crosses between D. mauritiana and D. simulans (Tao et al. 2001),
whereas the heterochromatic hlx locus causes hybrid lethality be- 
tween D. mauritiana and both of its sister species (Cattani and 
Presgraves 2009). 

Additional elements of intragenomic conflict have been 
identified between the more distantly related D. melanogaster and 
D. simulans, in which the interaction between the genes Hmr and 
Lhr contributes to hybrid male lethality in crosses between 
D. melanogaster and D. simulans (Brideau et al. 2006; Maheshwari 
and Barbash 2012). 

The D. simulans alleles of two nucleoporin genes, Nup96¹
and Nup160, cause recessive male lethality when crossed to a
D. melanogaster X chromosome (Presgraves et al. 2003; Tang and
Presgraves 2009), a phenomenon that has also been linked to



¹Throughout the manuscript, we refer to "Nup96 gene" as the part of the
Nup98-96 gene that corresponds to amino acid residues 1029-1961 in the
resulting protein; this part of the protein is frequently referred to as "NUP96."
Similarly, we refer to "Nup98 gene" as the part of the Nup98-96 gene that
corresponds to amino acid residues 1-1028 in the resulting protein; this part of
the protein is frequently referred to as "NUP98," e.g., in Presgraves et al.
(2003).



23:99-110 © 2013, Published by Cold Spring Harbor Laboratory Press; ISSN 1088-9051/13; www.genome.org 



Genome Research 99 

www.genome.org 



Nolte et al. 



genomic conflict (Presgraves 2007; Presgraves and Stephan 2007). 
While the NUP96 protein is highly conserved between D. simulans
and D. mauritiana, the D. mauritiana allele of Nup96 has no
hybrid-lethal effect, which suggests that more complex genetic
interactions lead to Nup96-dependent incompatibility (Barbash
2007).

Despite the importance of D. mauritiana as a model for un- 
derstanding the genetic basis of speciation, an annotated genome 
sequence is not yet available. Using de novo assembly we gener- 
ated a draft genome of D. mauritiana and estimated genome-wide 
polymorphism patterns from Pool-seq data. Our data show the 
impact of genes involved in genomic conflict on the evolution of 
the D. mauritiana lineage. Nucleoporin genes, implicated in hybrid 
incompatibilities that have evolved between D. simulans and 
D. melanogaster, are possible targets of recurrent positive selection 
due to ongoing genomic conflict (Presgraves and Stephan 2007). 
Unlike previous genome-wide polymorphism surveys of D. simulans 
and D. melanogaster (Begun et al. 2007; Langley et al. 2012), we find
that in the D. mauritiana lineage, nucleoporins are among the 
genes showing the strongest evidence of recurrent adaptive evo- 
lution. Furthermore, the presence of a pair of meiotic drive genes 
and a "speciation" gene at the center of two valleys of strongly 
reduced variability suggests that these sweeps have been caused by 
genes involved in genomic conflict. 



Results 

The recent advances in sequencing technology provide the op- 
portunity to perform population genetic analyses on a genome 
scale. Even for species with no available reference genome, it has 
become feasible to generate draft genomes that can be used for 
population genomic analysis. Here we pursue this strategy for 
D. mauritiana, for which no annotated reference genome is available 
yet. We sequenced the D. mauritiana strain MS17 using a mixture
of single-end and paired-end Illumina reads (Supplemental Table
S1), and assembled and annotated the draft genome (for further
details, see Supplemental Results). To study the impact of selection 
on the polymorphism pattern in D. mauritiana, we sequenced a 
pool of 107 isofemale lines (Supplemental Table S2). 

Faster rate of evolution on the X chromosome 

Since the X chromosome is hemizygous in males, rates of sequence 
evolution can be contrasted between the X chromosome and the 
autosomes to shed some light on the operating selective forces. 
Under the assumption that new mutations are recessive, pop- 
ulation genetics theory predicts a higher rate of evolution on the X 
chromosome than on the autosomes (Maynard Smith and Haigh 
1974; Charlesworth et al. 1987). 




Figure 1. Mean pairwise divergence (Dxy) along each major chromosomal arm between species of the D. melanogaster complex. The following species
pairs are shown: D. mauritiana-D. simulans (yellow), D. sechellia-D. simulans (orange), D. mauritiana-D. sechellia (green), D. simulans-D. melanogaster
(gray), and D. mauritiana-D. melanogaster (red). The sliding window analysis was performed using a window size of 500 kb and a step size of 100 kb;
chromosomal coordinates are those of D. mauritiana.

Table 1. Mean pairwise divergence (Dxy) between D. mauritiana, D. simulans, and D. melanogaster
based on alignments of repeat-masked genomes

                                      D. mauritiana-    D. mauritiana-     D. simulansᵃ-
                                      D. simulansᵃ      D. melanogaster    D. melanogaster

X                                     0.0168            0.0548             0.0540
2L                                    0.0216            0.0503             0.0496
2R                                    0.0201            0.0481             0.0474
3L                                    0.0211            0.0496             0.0488
3R                                    0.0198            0.0482             0.0474
4                                     0.0121            0.0781             0.0785
Mean Dxy autosomes                    0.0206            0.0491             0.0483
% Dxy X of autosomes                  81.5              111.7              111.9
P-value for difference between X      <2.2 × 10⁻¹⁶      <2.2 × 10⁻¹⁶       <2.2 × 10⁻¹⁶
  chromosome and autosomes
  (Wilcoxon rank-sum test)

Means are based on nonoverlapping 10-kb windows.
ᵃAssembly based on the African D. simulans strain Kib32.



Mean pairwise divergence (Dxy) between D. melanogaster and
D. mauritiana is significantly higher on the X chromosome
(mean Dxy = 0.0548) than on the major autosomes (mean Dxy of
the major autosomal arms = 0.0491, two-tailed Wilcoxon rank-sum
test based on nonoverlapping 10-kb windows, P < 2.2 × 10⁻¹⁶).
The same pattern is observed for the species pair D. melanogaster
and D. simulans (mean Dxy on the X chromosome = 0.0540;
mean Dxy on the major autosomes = 0.0483) (Fig. 1; Table 1),
which is consistent with the genome-wide data of Begun et al.
(2007).
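
The window-based comparison described above can be illustrated with a minimal sketch. It assumes two aligned, equal-length, uppercase sequences and computes the proportion of differing sites in nonoverlapping windows; the function names and the handling of masked sites are illustrative assumptions, not the pipeline used in this study. The second function expresses X-chromosomal divergence as a percentage of autosomal divergence, as in the "% Dxy X of autosomes" row of Table 1 (values recomputed from the rounded table entries can differ slightly in the last digit).

```python
def windowed_dxy(seq_a, seq_b, window):
    """Proportion of differing sites per nonoverlapping window.

    Assumes two aligned, equal-length, uppercase sequences. Sites where
    either sequence is not A/C/G/T (e.g., masked 'N' or gaps) are skipped.
    Returns None for windows without any comparable site.
    """
    valid = set("ACGT")
    result = []
    for start in range(0, len(seq_a) - window + 1, window):
        compared = 0
        diffs = 0
        for a, b in zip(seq_a[start:start + window], seq_b[start:start + window]):
            if a in valid and b in valid:
                compared += 1
                if a != b:
                    diffs += 1
        result.append(diffs / compared if compared else None)
    return result


def x_percent_of_autosomes(x_value, autosome_value):
    """X-chromosomal value expressed as a percentage of the autosomal value."""
    return 100.0 * x_value / autosome_value


# Toy alignment (hypothetical data) with 4-bp windows.
print(windowed_dxy("ACGTACGT", "ACGAACGT", 4))
# Rounded Table 1 values for D. mauritiana-D. simulans.
print(x_percent_of_autosomes(0.0168, 0.0206))
```

In practice the per-window values for the X chromosome and the autosomes would then be compared with a rank-based test such as the Wilcoxon rank-sum test reported in Table 1.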

Interestingly, comparisons within the D. simulans clade
show the opposite pattern: Mean pairwise divergence between
D. mauritiana and D. simulans, for example, is higher on the major
autosomal arms (Dxy = 0.0206) than on the X chromosome (Dxy =
0.0168, two-tailed Wilcoxon rank-sum test based on nonoverlapping
10-kb windows, P < 2.2 × 10⁻¹⁶). This pattern of higher
divergence on the autosomes holds for all comparisons among
species of the D. simulans clade (Fig. 1; Table 1; Supplemental Table
S3) and has been noted previously in the D. simulans-D. sechellia
comparison (Singh et al. 2008). Reduced divergence on the X
chromosome compared with the autosomes could be explained by
hybridization between species of the D. simulans clade (Ballard
2000; Morton et al. 2004; Nunes et al. 2010). If the X chromosome
experiences more interspecific gene flow than the autosomes, this
would result in a higher divergence on the autosomes.
Nevertheless, since Garrigan et al. (2012) found twice as many
fragments with a putative introgression signal on the autosomes
as on the X chromosome, we consider this scenario unlikely.

Another cause of lower divergence 
on the X chromosome could be less an- 
cestral polymorphism on the X chro- 
mosome than on the autosomes (Singh 
et al. 2008). Alternatively, selection on the 
short time scale could be mainly operat- 
ing on standing variation rather than 
on new mutations (Orr and Betancourt 
2001). Hence, assuming that in the 
D. simulans clade selection acts mainly on 
shared standing variation, the time scale 
may be too short to notice a higher sub- 



stitution rate on the X chromosome. In 
contrast, comparisons involving D. melanogaster
encompass longer time intervals
allowing for more novel mutations and 
fewer shared mutations, which makes 
the higher substitution rate on the X 
chromosome visible. 



Impact of the recombination landscape
on the partitioning of variation

When comparing levels of polymorphism
in D. mauritiana to those in
D. melanogaster, we find that D. mauritiana
is 40%-50% more variable than a cosmopolitan
D. melanogaster population (Table 2;
for further details, see Supplemental
Results). While a higher level of overall
polymorphism has been suggested previously based on a small 
number of loci (Hey and Kliman 1993; Moriyama and Powell 
1996), our Pool-seq data allow us to address the distribution of 
variability along all chromosomal arms. 

It is well understood that the recombination landscape in D.
melanogaster varies along the chromosomes. Both telomeres and
centromeres have a reduced recombination rate, but while the drop 
in recombination rate is abrupt at the telomeres, a gradual decrease 
in recombination rate over several megabases is observed toward 
the centromere on all major autosomal arms (True et al. 1996). D. 
mauritiana not only has a higher genome-wide recombination rate 
but also shows an important difference in the recombination 
landscape: Instead of an extended gradual decrease in re- 
combination rate near the centromere, the suppression of re- 
combination is restricted to a very small pericentric region (True 
et al. 1996). 

Since the correlation between recombination rate and variability
is well studied in D. melanogaster (Begun and Aquadro 1992;
Hudson 1994), we asked whether the change in recombination
landscape affects the pattern of variability in genomic regions
toward the centromere. Figure 2 shows that in D. melanogaster,
polymorphism declines toward the centromeres, whereas in
D. mauritiana, levels of variability remain almost flat throughout



Table 2. Mean nucleotide diversity (π) and mean Tajima's D per chromosomal arm in
D. mauritiana compared with the D. melanogaster population from Portugal

                          Mean π            Mean π              Tajima's D        Tajima's D
                          D. mauritianaᵃ    D. melanogasterᵃ    D. mauritianaᵃ    D. melanogasterᵃ

X                         0.0059            0.0039              -1.94
2L                        0.0092            0.0077              -1.70             -1.21
2R                        0.0087            0.0060              -1.71             -1.41
3L                        0.0095            0.0066              -1.67             -1.40
3R                        0.0086            0.0059              -1.73             -1.50
4                         0.0011            0.0009              -2.20             -2.42
Mean autosomes            0.0090            0.0066              -1.70             -1.38
% X of autosomes          65.7              60.1
% X*4/3 of autosomes      87.7              80.1

Both data sets were repeat-masked, and means were calculated from nonoverlapping 10-kb windows.
ᵃThe D. mauritiana data set was analyzed using a minimum count of 3, a minimum coverage of 6, and
a maximum coverage of 250; the D. melanogaster data set was analyzed using a minimum count of 2,
a minimum coverage of 4, and a maximum coverage of 150. For calculation of Tajima's D, both data sets
were subsampled to a 30-fold coverage and analyzed without correcting for sequencing errors and
multiple sampling.

Figure 2. Nucleotide diversity (π) along the major chromosomal arms in D. mauritiana (red line) and D. melanogaster (gray line). The sliding window
analysis was performed using 500-kb windows with a step size of 100 kb; chromosomal coordinates have been adjusted to D. melanogaster.



the entire chromosome. Moreover, not only is the level of vari- 
ability reduced in low-recombining regions in D. melanogaster, but 
the allele frequency spectrum is affected as well. Tajima's D (Tajima 
1989) is a frequently used summary statistic, which describes de- 
viations of the allele frequency spectrum from the standard neu- 
tral model. We plotted Tajima's D along the D. melanogaster and 
D. mauritiana chromosomes and observed more negative Tajima's 
D values toward the centromere in D. melanogaster (Fig. 3). A 
similar trend was seen for Tajima's D of synonymous sites (Supplemental
Fig. S1): Consistent with no reduced recombination
rate, Tajima's D remains unaffected by proximity to the centro- 
mere for most of the D. mauritiana chromosomes. This shift to- 
ward more negative Tajima's D values in low-recombining regions 
of D. melanogaster is consistent with selection at linked sites af- 
fecting neutral variability, either due to recurrent sweeps of fa- 
vorable mutations (hitchhiking) (Maynard Smith and Haigh 
1974) or, possibly, due to background selection, caused by the 
removal of linked deleterious mutations (Charlesworth et al. 
1993). 
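
As an illustration of the summary statistic discussed above, the following minimal sketch implements the classic Tajima's D for a sample of n individually sequenced chromosomes. It is not the Pool-seq-specific estimator used for Figure 3 and Table 2; the function name and the input format (one allele count per segregating site) are assumptions made for this example.

```python
import math


def tajimas_d(allele_counts, n):
    """Classic Tajima's D (Tajima 1989) for n sampled chromosomes.

    allele_counts holds, for each site, the number of chromosomes
    carrying one of the two alleles; sites that are not segregating
    (count 0 or n) are ignored. Returns None if D is undefined.
    """
    counts = [k for k in allele_counts if 0 < k < n]
    s = len(counts)  # number of segregating sites
    if s == 0 or n < 4:
        return None

    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / (i * i) for i in range(1, n))

    # Mean number of pairwise differences (theta_pi).
    theta_pi = sum(2.0 * k * (n - k) / (n * (n - 1)) for k in counts)
    # Watterson's estimator (theta_W).
    theta_w = s / a1

    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / (a1 * a1)
    e1 = c1 / a1
    e2 = c2 / (a1 * a1 + a2)

    return (theta_pi - theta_w) / math.sqrt(e1 * s + e2 * s * (s - 1))


# Hypothetical data: an excess of rare variants gives a negative D,
# an excess of intermediate-frequency variants a positive D.
print(tajimas_d([1, 1, 1, 1, 1], n=10))
print(tajimas_d([5, 5, 5, 5, 5], n=10))
```

An excess of low-frequency variants, as expected after a sweep or under background selection, pushes the statistic toward negative values, which is the direction of the shift described for low-recombining regions.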

Because low recombination rates will decrease the efficacy of 
selection, we compared the ratio of nonsynonymous to synony- 
mous polymorphisms along the chromosomes of both species (as 
in Presgraves 2005; Betancourt et al. 2009). In D. melanogaster, the 
number of nonsynonymous substitutions relative to synonymous 
ones increases with the decrease in recombination rate toward the 
centromere (Presgraves 2005). In D. mauritiana, however, almost 
no effect can be noticed (Fig. 4). 
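
The gene-based sliding windows described in the legend of Figure 4 (50 genes per window, a step size of one gene, and exclusion of genes with a ratio above 3) can be sketched as follows. Averaging per-gene ratios within a window is an assumption of this example; the ratio of window sums would be an equally plausible reading, and the function name and input lists are hypothetical.

```python
def sliding_pin_pis(pi_n, pi_s, window=50, step=1, max_ratio=3.0):
    """Sliding-window mean of per-gene piN/piS ratios along a chromosome.

    pi_n and pi_s are lists of nonsynonymous and synonymous diversity for
    genes in chromosomal order. Genes without synonymous diversity or with
    a ratio above max_ratio are dropped before windows are formed.
    """
    ratios = [n / s for n, s in zip(pi_n, pi_s) if s > 0 and n / s <= max_ratio]
    return [
        sum(ratios[i:i + window]) / window
        for i in range(0, len(ratios) - window + 1, step)
    ]


# Hypothetical toy input with a window of two genes; the last gene
# (ratio 4.0) is excluded by the max_ratio filter.
print(sliding_pin_pis([1, 2, 3, 40], [10, 10, 10, 10], window=2))
```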



The effects of elevated recombination rates in D. mauritiana 
are further apparent from the patterns of codon usage (for details, 
see Supplemental Results). 

Signatures of positive selection in D. mauritiana 

The neutral theory predicts a correlation between polymorphism 
and divergence. The McDonald-Kreitman test builds on this 
prediction and compares the ratio of synonymous and non- 
synonymous polymorphism to the ratio of synonymous and 
nonsynonymous divergence; under neutrality, these quantities 
will be equal (McDonald and Kreitman 1991). Using a polarized 
McDonald-Kreitman test, we surveyed polymorphism and di- 
vergence (from D. melanogaster and Drosophila yakuba) for 10,217 
genes in D. mauritiana. 
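
The logic of the McDonald-Kreitman test can be sketched for a single gene as a 2 × 2 contingency table of nonsynonymous and synonymous counts for divergence and polymorphism. The example below uses Fisher's exact test, which is an assumption (the test statistic used in this study is not stated in this section), and the counts are invented for illustration. It also reports the commonly used estimate alpha = 1 - (Ds × Pn)/(Dn × Ps) of the proportion of adaptive substitutions.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables at least as extreme as observed.
    total = sum(p for p in (prob(x) for x in range(lo, hi + 1))
                if p <= p_obs * (1 + 1e-9))
    return min(1.0, total)


def mk_test(dn, ds, pn, ps):
    """Return (P-value, alpha) for divergence (dn, ds) and polymorphism (pn, ps)."""
    p_value = fisher_exact_two_sided(dn, ds, pn, ps)
    alpha = 1.0 - (ds * pn) / (dn * ps) if dn > 0 and ps > 0 else None
    return p_value, alpha


# Hypothetical counts: an excess of nonsynonymous divergence.
print(mk_test(dn=20, ds=10, pn=5, ps=40))
```

In the polarized version of the test, only substitutions assigned to the lineage of interest with the help of an outgroup would enter the divergence counts.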

We found 43 genes (FDR < 0.05) that deviated significantly 
from the neutral expectation in the polarized test of D. mauritiana 
with D. melanogaster as reference and D. yakuba as outgroup 
(Supplemental Table S4). A detailed list of significant genes, including
those identified by unpolarized versions of the McDonald-Kreitman
test, is shown in Supplemental Tables S5-S7.
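
Because thousands of genes are tested, significance is reported at a false discovery rate (FDR) threshold. The procedure used to control the FDR is not specified in this section; the sketch below assumes the Benjamini-Hochberg step-up method and converts a list of P-values into q-values. The function name and the example P-values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg q-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest to the smallest P-value so that each
    # q-value is the minimum over all equal or larger ranks.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        qvalues[i] = running_min
    return qvalues


# Hypothetical P-values from three gene-wise tests.
print(benjamini_hochberg([0.001, 0.5, 0.9]))
```

Genes whose q-value falls below the chosen threshold (e.g., 0.05) would then be retained as candidates.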

While several of these genes overlapped with previous studies 
(for further details, see Supplemental Results), we made three 
particularly interesting observations in D. mauritiana. 

First, we find strong evidence for positive selection for a gene 
that has been proposed to cause morphological divergence 
(number of sex comb teeth) between the two sister species 
Figure 3. Tajima's D along the major chromosomal arms in D. mauritiana (red line) and D. melanogaster (gray line). The sliding window analysis was
performed using 500-kb windows with a step size of 100 kb; chromosomal coordinates have been adjusted to D. melanogaster.



D. simulans and D. mauritiana. Graze et al. (2007) identified 
CD98hc by tissue-specific gene expression differences between 
both species but did not characterize its molecular evolution. Our 
analysis suggests that not only regulatory changes but also struc- 
tural variation (Hoekstra and Coyne 2007) contribute to mor- 
phological divergence. 

Second, we did not detect an accelerated rate of evolution for 
olfactory and gustatory receptor (Or and Gr) or Accessory gland 
protein (Acp) genes, some of which have been found to evolve 
rapidly in other Drosophila species (Begun and Lindfors 2005; Guo 
and Kim 2007). While gene families are more likely to be excluded 
from de novo assemblies, we think that species-specific selection 
patterns are the better explanation (for further details, see the 
Supplemental Discussion). 

Third, the overrepresentation of nucleoporin genes among 
the selected genes suggests that genomic conflict may be one 
major driver of adaptive evolution in D. mauritiana. The most 
significant Gene Ontology (GO) term for genes significant at 
FDR 0.05 (43 genes) in the polarized MK test is "SMAD protein 
import into the nucleus" (Supplemental Table S8; for the GO enrichment
analysis of the unpolarized McDonald-Kreitman tests,
see Supplemental Tables S9, S10). This pointed to nucleoporin
genes, some of which have been previously described to be rapidly 
evolving and to cause hybrid incompatibility (Presgraves 2003; 
Presgraves and Stephan 2007; Tang and Presgraves 2009). 

Further manual inspection of the top candidates from the
polarized test identified three nucleoporin genes. The gene associated
with the most significant MK test, CG8771, is a homolog of
the human nucleoporin gene Nup188, which is involved in controlling
membrane protein traffic and maintenance of nuclear
membrane homeostasis (Theerthagiri et al. 2010). The yeast homolog
Nup188 plays a role in structural organization of the nuclear
pore (Nehrbass et al. 1996; Miller et al. 2000). As expected for
a member of a nucleopore complex, CG8771 interacts with several
other nucleopore proteins (as indicated in the protein-protein
interaction database STRING 9.0) (Jensen et al. 2009). Interestingly,
two of the interacting partners, Nup107 and CG11943,
a homolog of the human NUP205 gene, were also found among the
top 43 candidates. One of them, CG11943, has been previously
described as a rapidly evolving gene in a comparison of D. simulans
and D. melanogaster (Jagadeeshan and Singh 2005). Both Nup107,
part of the Nup107-160 complex (Vasu and Forbes 2001), and
CG11943, a member of the Nup53-93 complex (Chen and Xu
2010), appear to interact not only with CG8771 but also with each
other (Jensen et al. 2009) (but see Theerthagiri et al. [2010] for
evidence against an interaction between human NUP188 and
NUP205). Since several Nups have not yet been identified as
nucleoporins in the D. melanogaster annotation r5.32 and are
thus missing in the GO databases (e.g., CG8771), the GO term
analysis does not adequately address whether or not Nups are
overrepresented among our candidate genes. We thus tested further
for an overrepresentation of nucleoporins among our candidate
genes by assuming that about 30 nucleoporins exist in the D.
mauritiana genome (Wente and Rout 2010) and find that Nups are






Figure 4. Ratio of nonsynonymous and synonymous nucleotide diversity (πN/πS) along the major chromosomal arms in D. mauritiana (red line) and D.
melanogaster (gray line). The sliding window analysis was performed using 50 genes per window and a step size of one gene. For D. mauritiana, genes from all
four gene sets were included and matched with the orthologous gene in D. melanogaster. Genes with a πN/πS ratio >3 were excluded from both data sets.



highly significantly overrepresented (two-tailed P < 0.0001, χ² test
with Yates correction).
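
The overrepresentation test can be retraced with a minimal sketch. It assumes a 2 × 2 table built from the numbers given in the text (three nucleoporins among the 43 candidates, about 30 nucleoporins in total, and 10,217 genes surveyed); whether the original test used exactly this table is an assumption. The P-value for one degree of freedom is obtained from the complementary error function.

```python
import math


def yates_chi_square(a, b, c, d):
    """Chi-square test with Yates continuity correction for [[a, b], [c, d]].

    Returns (chi-square statistic, two-tailed P-value for 1 degree of freedom).
    """
    n = a + b + c + d
    corrected = max(0.0, abs(a * d - b * c) - n / 2.0)
    chi2 = n * corrected ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Survival function of the chi-square distribution with 1 d.f.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Counts taken from the text; the table layout is an assumption.
genes_tested = 10217
nups_total = 30
candidates = 43
nup_candidates = 3

a = nup_candidates                      # nucleoporin, candidate
b = nups_total - nup_candidates         # nucleoporin, not a candidate
c = candidates - nup_candidates         # other gene, candidate
d = genes_tested - nups_total - c       # other gene, not a candidate

print(yates_chi_square(a, b, c, d))
```

Because the expected number of nucleoporin candidates is well below one, an exact test would be a reasonable alternative to the chi-square approximation.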

Given this overrepresentation of Nups, we searched for further
evidence of positive selection operating on additional Nups by
relaxing our search criteria. Nup154 is significant in the unpolarized
test with D. melanogaster at FDR 0.1 and at FDR 0.001
with D. yakuba as outgroup. Nup154 is a conserved nucleoporin
essential for viability (Kiger et al. 1999) and crucial for normal
oogenesis and spermatogenesis (Gigliotti et al. 1998; Colozza et al.
2011), and interacts with CG8771 and CG11943 (Jensen et al.
2009), homologs of the human NUP188 and NUP205 genes, respectively.
Nup160, a hybrid lethality gene between D. simulans
and D. melanogaster (Tang and Presgraves 2009), is significant at
FDR 0.01 (FDR 0.001) in the unpolarized test with D. melanogaster
(D. yakuba) as reference, and Nup133 at FDR 0.1 in the polarized
test (Supplemental Tables S5-S7).

Polarized tests based on D. melanogaster as reference are not 
suited to determine whether positive selection predates the split of 
D. simulans and D. mauritiana or is still ongoing after the split of 
the two species. Given the strong evidence for positive selection 
operating on Nups, we reasoned that we should have enough 
power to identify ongoing positive selection after the species split 
and repeated the polarized tests using D. simulans as reference and 
D. melanogaster as outgroup. We analyzed the three candidate Nups
CG8771, CG11943, and Nup107, their interaction partners as
listed in the STRING interaction database v. 9.0 (Jensen et al. 2009),
as well as Nup160, and found strong evidence for ongoing positive
selection after the split of D. mauritiana and D. simulans for several 



Nups (Table 3). Analyzing Pool-seq data from African D. simulans
(V Nolte and C Schlotterer, unpubl.) with D. mauritiana as reference,
we also find evidence for ongoing positive selection in D.
simulans. The X-linked gene, CG11943, a homolog of the human
NUP205 gene, shows one of the strongest signatures of recent
positive selection in both species. Overall, the evidence for positive
selection was stronger in D. simulans than in D. mauritiana: Most
nucleoporins with signatures of ongoing rapid evolution in D.
mauritiana show an even more significant test result in D. simulans,
and two nucleoporins (Nup133 and Nup153) appear to evolve
rapidly in D. simulans only, but not in D. mauritiana.

Given the striking evidence for rapid evolution of Nups, 
which is possibly driven by intragenomic conflict, we turned our 
attention to RNAi genes, which are also thought to evolve rapidly 
due to genomic conflict (Obbard et al. 2006, 2009a,b). Only 16 out 
of 23 RNAi genes studied by Obbard et al. (2006, 2009a) and 
Kolaczkowski et al. (2011) were included in the initial D. mauritiana 
annotation. Hence, we manually curated the annotation of the 
seven missing RNAi genes (AGO2, armi, krimp, AGO3, mael, rhi, and
squ). Consistent with positive selection, we found two RNAi genes
(aub and AGO2) to show a significant polarized MK test with D.
melanogaster as reference and D. yakuba as outgroup (Supplemental
Table S11). Three other genes (armi, Fmr1, and Dcr-2) were only
significant (P < 0.05) when no correction for multiple testing was
applied. Two genes in the D. mauritiana data set (aub and Dcr-2) and one
gene in the African D. simulans data set (armi) showed significant
evidence for ongoing positive selection after the species split
(Supplemental Table S12).

Table 3. P-values of polarized McDonald-Kreitman tests at candidate nucleoporin genes and
some of their interacting partners using D. mauritiana and an African D. simulans sample with
D. simulans and D. mauritiana, respectively, as closely related reference, and D. melanogaster or
D. yakuba as outgroup

Polymorphism data         D. mauritiana      D. mauritiana      African D. simulans   African D. simulans
Reference species data    D. melanogaster    D. simulans        D. melanogaster       D. mauritiana
Outgroup species          D. yakuba          D. melanogaster    D. yakuba             D. melanogaster

CG8771                    0.0000***          0.0339             0.0000***             0.0436
CG11943                   0.0004**           0.0000***          0.0000***             0.0000***
Nup107                    0.0000***          0.0202             0.0021**              0.0026**
Nup133                    0.0016**           1.0000             0.0002***             0.0321
Nup153                    0.0208*            0.2114             0.0350                0.0114*
Nup75                     0.1042             0.0551             0.0306                0.0718
Nup154                    0.0031**           0.0016*            0.0065*               0.0000***
CG6540                    0.3261             0.2774             0.5820                0.2063
Nup62                     0.1937             0.5211             0.2418                1.0000
Nup44A                    1.0000             1.0000             1.0000                1.0000
Nup98                     1.0000             0.3904             0.5204                0.0980
Nup96                     0.0060*            0.5410             0.0014**              1.0000
Nup160                    0.0005**           0.0557             0.0001***             0.0000***

Asterisks denote genes remaining significant after correcting for multiple testing.
(*) FDR q-value <0.05.
(**) FDR q-value <0.01.
(***) FDR q-value <0.001.



A polymorphism trough around two loci involved 
in genomic conflict 

Classic selective sweeps, in which the favorable allele starts at 
a very low frequency and increases until it (almost) reaches fixa- 
tion, cause a characteristic imprint on the polymorphism pattern 
in the genome (Maynard Smith and Haigh 1974; Kaplan et al. 1989). The
variability in the genomic region flanking the target of selection is 
strongly reduced and increases gradually with distance from the 
selected site. The shape of such a trough depends on various pa- 
rameters, such as the initial frequency of the selected allele, the 
selection coefficient, and the recombination rate. Figure 5 shows 
the partitioning of variation along the D. mauritiana X chromo- 
some. Two very pronounced troughs in variability can be recog- 
nized that could not be attributed to alignment artifacts (see Sup- 
plemental Results). In both regions with reduced variability, we 
noticed a high differentiation (Fst) (data not shown) from African 
D. simulans, but no increase in sequence divergence (Dxy) (Fig. 1). 

The first region encompasses ~600 kb with a threefold reduction
in variability relative to the average X-chromosomal diversity
(mean π for the region = 0.002 vs. mean X-linked π =
0.0059; two-tailed Wilcoxon rank-sum test, nonoverlapping 10-kb
windows, P < 2.2 × 10⁻¹⁶). In the central position of the trough,
the variability is even further reduced (mean π = 0.001, coordinates
8.75-9.05 Mb). In addition, we observed a pronounced reduction
in Tajima's D values compared with the remainder of the X chromosome
(two-tailed Wilcoxon rank-sum test, nonoverlapping 10-kb
windows, P < 2.2 × 10⁻¹⁶) (Fig. 3).
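
The quantities compared above can be made concrete with a minimal sketch of window-based nucleotide diversity. It assumes per-site allele counts and coverage from a pooled sample and applies only the standard n/(n - 1) sample-size correction; the Pool-seq-specific corrections and count or coverage filters used in this study (see Table 2) are not reproduced, and all names and the toy data are hypothetical. The second function gives the fold reduction relative to the chromosome-wide mean.

```python
def window_pi(sites, window_start, window_size, covered_sites=None):
    """Nucleotide diversity (pi) per site for one window.

    sites is a list of (position, minor_allele_count, coverage) tuples for
    polymorphic positions. covered_sites is the number of sufficiently
    covered positions in the window; the window size is used if omitted.
    """
    total = 0.0
    for position, count, coverage in sites:
        if window_start <= position < window_start + window_size and coverage > 1:
            p = count / coverage
            # Heterozygosity with the n/(n - 1) sample-size correction.
            total += coverage / (coverage - 1) * 2.0 * p * (1.0 - p)
    length = covered_sites if covered_sites else window_size
    return total / length


def fold_reduction(background_pi, region_pi):
    """How many times lower the regional diversity is than the background."""
    return background_pi / region_pi


# Toy window (hypothetical data): one polymorphic site inside a 10-bp window.
print(window_pi([(5, 5, 10), (20, 1, 10)], window_start=0, window_size=10))
# Values reported in the text for the first trough and for the X chromosome.
print(fold_reduction(0.0059, 0.002))
```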

The width of the trough suggests an exceptionally strong se- 
lective sweep, since it is located in a genomic region of normal to 
high recombination (True et al. 1996), and no common inversion 
polymorphism has been described in D. mauritiana (for review, see 
Aulard et al. 2004). A close inspection of the function of the about 
37 genes in the genomic region of reduced variability did not show
any gene for which adaptive evolution was previously suggested. 
The only gene that could tentatively be associated with positive 



selection is Ser7, since it seems to be in- 
volved in immune response (Irving et al. 
2001; Hill-Burns and Clark 2009). We 
thus turned our attention to other possi- 
ble causes of selective sweeps. In addition 
to beneficial alleles that provide some 
fitness benefit to the organism, alleles 
involved in genomic conflict can also 
have very strong selective advantages 
(Presgraves et al. 2009) and thus the po- 
tential to drive selective sweeps. We note 
that a pair of genes causing sex-ratio dis- 
tortion in D. simulans is located within 
the region with the most extreme re- 
duction in variability (Fig. 5; Supple- 
mental Fig. S2): Alleles of the paralogous 
genes Mother of Dox (MDox) and Dox 
function as drivers in a well-characterized 
sex-ratio meiotic drive system in D. sim- 
ulans (Tao et al. 2007a; Kingan et al. 
2010). The estimated selection coefficient 
s ranges from 0.12 to 0.39, depending on 
the parameter values used (Supplemental 
Table S13), which could be consistent
with strong selection during genomic 
conflict (Curtsinger 1984). 
The second trough in variability on the X chromosome 
extends over an even larger region but shows a less pronounced 
reduction in variability. A genomic region of —1000 kb between 
coordinates 16-17 Mb of the D. mauritiana reference genome 
shows an approximately twofold reduction in variability (mean 
IT = 0.0030 vs. mean X-linked tt = 0.0059, two-tailed Wilcoxon 
rank-sum test, nonoverlapping lO-kb windows, P < 2.2 X 10~^^) 
(Fig. 5). Similar to the first sweep region, Tajima's D is also lower 
than in the remainder of the X chromosome (two-tailed Wilcoxon 
rank-sum test of Tajima's D values for 10-kb windows in each 
sweep region vs. 10-kb windows in the remainder of the X chro-
mosome, P < 2.2 × 10⁻¹⁶) (Fig. 3). Estimates for the selection co-
efficient range from 0.04 to 0.46 (Supplemental Table S13).
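
The window-based comparisons reported above can be illustrated with a minimal sketch of a two-tailed Wilcoxon rank-sum test. This is not the R code used for the analyses in this study: it is a pure-Python normal approximation with a tie correction and without a continuity correction, so P values may differ slightly from those of R. The function name and the example π values are hypothetical and serve only as an illustration.

```python
import math


def rank_sum_test(x, y):
    """Two-tailed Wilcoxon rank-sum (Mann-Whitney U) test, normal approximation.

    Returns (U statistic of sample x, two-tailed P value).
    Tied values receive average ranks and the variance is tie-corrected.
    """
    n1, n2 = len(x), len(y)
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    ranks = [0.0] * len(pooled)
    tie_term = 0.0
    i = 0
    while i < len(pooled):
        # Find the run of tied values starting at index i.
        j = i
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        t = j - i + 1
        tie_term += t**3 - t
        i = j + 1
    rank_sum_x = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    u_x = rank_sum_x - n1 * (n1 + 1) / 2
    n = n1 + n2
    mean_u = n1 * n2 / 2
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return u_x, 1.0
    z = (u_x - mean_u) / math.sqrt(var_u)
    p_two_tailed = math.erfc(abs(z) / math.sqrt(2))
    return u_x, p_two_tailed


# Hypothetical per-window pi values (not data from this study).
sweep_windows = [0.0028, 0.0031, 0.0025, 0.0033, 0.0030]
other_windows = [0.0061, 0.0055, 0.0058, 0.0064, 0.0057, 0.0060]
u, p = rank_sum_test(sweep_windows, other_windows)
print(f"U = {u}, two-tailed P = {p:.4g}")
```

With only a handful of windows, an exact test would be preferable to the normal approximation; the approximation is adequate for the hundreds of 10-kb windows compared along a chromosome.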

The region of reduced variability contains more than 60 genes 
with many of them having no known function. The strongest re- 
duction in variability is observed at the proximal border of the 
sweep window and harbors the haplolethal 16F gene cluster de- 
scribed in D. melanogaster (Prado et al. 1999). 

Surprisingly, at the center of the window of reduced vari- 
ability, we find another gene with a well-documented role in spe- 
ciation and, possibly, genomic conflict, Odysseus (OdsH) (Ting et al. 
1998; Bayes and Malik 2009). In addition, the sweep around the 
OdsH gene extends to the region in which the enhancer of Dox 
(E[Dox]) has been located, a not yet precisely mapped factor 
proximal to the gene forked that enhances the sex-ratio distorting 
effect of Dox (Fig. 5; Tao et al. 2007a). 

Discussion 

Quality of draft genomes based on paired-end Illumina
sequencing 

Here, we have built a high-quality draft genome of a Drosophila 
species using only short paired-end reads. Using a conservative, 
D. melanogaster-centric annotation, we recovered a similar number 
of genes as a previous genome project did for D. simulans and

Figure 5. Nucleotide diversity (π) along the D. mauritiana X chromosome. The locations of genes potentially causing the two selective sweeps are
indicated: (large red diamond) MDox/Dox; (large blue diamond) OdsH; (small red diamond) E(Dox). Nucleotide diversity (π) is plotted in nonoverlapping
10-kb windows.



D. sechellia (Clark et al. 2007). The analyses in this study demon- 
strate that a draft genome facilitates addressing several important 
evolutionary questions. Nevertheless, we caution that it has some 
shortcomings. First, we are not able to provide a correct annotation 
of transposable elements in the D. mauritiana reference genome, 
since repetitive structures cannot be reliably assembled with short 
reads (Phillippy et al. 2008). Furthermore, heterochromatic se- 
quences are poorly represented. Since the traces of genomic conflict 
are so apparent in D. mauritiana, and heterochromatic sequences 
may be important players in genomic conflict and speciation 
(Brideau et al. 2006; Cattani and Presgraves 2009; Ferree and Barbash 
2009; Meiklejohn et al. 2011), we caution that the genomic sig- 
natures of ongoing genomic conflict are probably incomplete 
without the corresponding heterochromatic regions. Finally, gene 
families composed of closely related paralogs tend to be collapsed 
into a single copy during de novo assembly (e.g., the Hsp70 gene 
cluster). 

Nevertheless, our assembly recovered a large fraction of gene 
families that are frequently identified as targets of positive selec-
tion (Supplemental Table S14). Accessory gland protein (Acp) and
seminal fluid protein (Sfp) genes, which belong to recently dupli- 
cated gene families, frequently evolve under positive selection (for 
review, see Ram and Wolfner 2007), but we did not find genes in 
these categories among the top candidates for adaptively evolving 
genes in the polarized McDonald-Kreitman test. Similarly, olfac-
tory (Or) and gustatory receptor (Gr) genes are frequently involved
in ecological adaptation and speciation (Guo and Kim 2007;
McBride 2007; Tunstall et al. 2007; Gardiner et al. 2008, 2009), but 
we also failed to identify such genes among the top candidates. 
Since our annotation recovered a large fraction of the adaptively 
evolving genes described in D. melanogaster, we consider it un- 
likely that the absence of a molecular signature of adaptation in 
D. mauritiana is an annotation artifact. Rather, we speculate that 
the selective forces driving an adaptive response of these genes in 
other Drosophila species are less prominent in D. mauritiana. 

Nucleoporins as a preferential target for positive selection 

Despite the fact that the function and composition of nuclear 
pore complexes are highly conserved, recent work showed that 
some of their components, the nucleoporins, evolve rapidly, and 



two of them cause hybrid lethality in Drosophila (Presgraves et al. 
2003; Presgraves and Stephan 2007; Tang and Presgraves 2009). 
Presgraves and Stephan (2007) suggest three forms of genetic 
conflict that could drive the rapid evolution of nucleoporins: (1) 
host-parasite conflict due to viruses that need to enter via the 
gatekeeper nuclear pore complexes, which function to exclude 
invading viruses; (2) intragenomic conflict due to centromeric 
drive, since some Nups are associated with kinetochores; (3) 
intragenomic conflict due to other forms of segregation distor- 
tion, since nuclear pore complexes may potentially suppress 
them. Our analyses cannot distinguish between these hypothe- 
ses, but they provide additional evidence that many Nups evolve 
unusually rapidly due to positive selection. Furthermore, we 
show that the rapid evolution is not restricted to some time 
during the divergence between D. melanogaster and D. simulans 
(Presgraves et al. 2003; Presgraves and Stephan 2007; Tang 
and Presgraves 2009), but that positive selection is an ongoing 
process that continues after the split between D. mauritiana and 
D. simulans. 

Our evolutionary analyses provide some insights into 
how nucleoporin genes may be involved in genomic conflict. 
CG11943, a homolog of the human NUP205 gene, shows strong
evidence of ongoing selection in D. mauritiana and D. simulans.
Presgraves et al. (2003) and Tang and Presgraves (2009) previously
identified Nup96 and Nup160 as one cause of hybrid lethality between
D. simulans and D. melanogaster, due to an interaction with an as-yet-
unidentified X-linked factor. Given that CG11943 is located on the
X chromosome, we speculate that it may be an alternative in-
teraction partner of Nup160 and/or Nup96 instead of (or in addi-
tion to) their suggested partner, Nup153.

While the McDonald-Kreitman test with D. simulans as a 
reference indicates that the high rate of sequence evolution is 
ongoing in D. mauritiana, there was no clear signature of a selective 
sweep at nucleoporin genes in the polymorphism data (Fig. 6). 
Since this observation is consistent with the analyses of Presgraves 
and Stephan (2007) and Tang and Presgraves (2009), we hypoth- 
esize that positively selected mutations in interacting proteins may 
lead to complex sweep dynamics, which could retard the spread of 
a beneficial mutation. As a consequence, beneficial mutations at 
Nups may result in a signature that more closely resembles a soft sweep
(Hermisson and Pennings 2005) than a hard sweep.

Figure 6. Nucleotide diversity (π) and Tajima's D at selected nucleoporin genes in D. mauritiana. (A)
Nucleotide diversity (π) at selected nucleoporin genes in comparison to the average chromosome-wide
nucleotide diversity (π) of the corresponding chromosome (dashed line). (B) Tajima's D at selected
nucleoporin genes in comparison to the average chromosome-wide Tajima's D of the corresponding
chromosome (dashed line).



Two recent selective sweeps potentially associated with genomic 
conflict 

One of our most striking findings is that the D. mauritiana genome
harbors two large regions (0.6 and 1 Mb) of profoundly reduced
diversity, suggesting that two exceptionally strong selective sweeps
have occurred in this species. In comparison, genomic signatures
of previously described sweeps in Drosophila are much narrower
with estimated selection coefficients being about an order of
magnitude lower: The broadest valley of reduced variability de-
scribed to date is caused by the insertion of a transposable element
next to the gene Cyp6g1, which confers resistance to DDT; the
sweep extends over ~100 kb and has an associated selection co-
efficient of 0.022 (Schlenke and Begun 2004).

The gene that could most parsimoniously be assumed to drive 
one of the sweeps on the D. mauritiana X chromosome is OdsH. 
This gene was initially described as a "speciation gene" that causes 
hybrid male sterility between D. mauritiana and D. simulans and 
shows a strongly accelerated rate of evolution in D. mauritiana 
(Ting et al. 1998). Later it was recognized that the OdsH gene product
interacts with Y-linked heterochromatin
in hybrids between D. mauritiana and
D. simulans but not in pure species, sug- 
gesting that it could be involved in geno- 
mic conflict (Bayes and Malik 2009; 
Meiklejohn et al. 2011). Consistent with 
previous reports, we observe a much higher 
number of nonsynonymous fixations 
along the D. mauritiana than the D. simu- 
lans lineage, but in both species neither 
the homeodomain nor the entire gene 
showed evidence of positive selection in 
a polarized McDonald-Kreitman test. 
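
The logic of the McDonald-Kreitman test referred to above can be sketched as a 2 × 2 contingency test on counts of nonsynonymous and synonymous fixed differences (Dn, Ds) and polymorphisms (Pn, Ps). The code below is only an illustration and is not the MK.pl script used in this study; the function names and the example counts are hypothetical, and Fisher's exact test is used here as one common choice of contingency test. In a polarized test, Dn and Ds would include only fixations assigned to the lineage of interest.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(k):
        # Hypergeometric probability of observing k in the top-left cell.
        return comb(row1, k) * comb(row2, col1 - k) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one.
    total = sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))
    return min(1.0, total)


def mk_test(dn, ds, pn, ps):
    """Return (alpha, P value) for a McDonald-Kreitman table.

    alpha = 1 - (Ds * Pn) / (Dn * Ps) estimates the proportion of
    nonsynonymous fixations driven by positive selection; it is None
    when Dn or Ps is zero.
    """
    alpha = None if dn == 0 or ps == 0 else 1 - (ds * pn) / (dn * ps)
    return alpha, fisher_exact_two_sided(dn, ds, pn, ps)


# Illustrative counts only (not taken from this study).
alpha, p_value = mk_test(dn=7, ds=17, pn=2, ps=42)
print(f"alpha = {alpha:.3f}, P = {p_value:.4f}")
```

An excess of nonsynonymous fixations relative to nonsynonymous polymorphisms (alpha > 0 with a small P value) is the pattern interpreted as evidence of positive selection.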

The second selective sweep might be 
caused by the Winters meiotic drive sys- 
tem, which is well-characterized in D. 
simulans (Tao et al. 2007a,b; Kingan et al. 
2010). This system consists of at least 
three components: the drivers MDox and 
Dox, the autosomal dominant suppressor 
Nmy, and the enhancer of Dox [E(Dox)]. 
The sequences of these genes are highly 
similar, partially derived from one an- 
other by tandem duplication and retro- 
transposition, and contain tandem repeat 
structures (Tao et al. 2007a,b). As a result 
of the sequence similarity and repetitive 
structure of these loci, reliable sequence 
analysis of this region is difficult, even 
with targeted PCR approaches (Tao et al.
2007a,b), and essentially impossible with 
genome-wide short read sequencing. 

Our D. mauritiana strains did not 
show obvious signs of sex-ratio distor- 
tion, but theoretical models predict that 
ongoing intragenomic conflict results 
in rapid cycles during which compet- 
ing alleles rise and fall in frequency 
(Charlesworth and Hartl 1978; Carvalho 
and Vaz 1999; Hall 2004). Driver alleles, 
such as sex-ratio distorters, will increase 
in frequency until a suppressor allele ar- 
rives, which spreads, and the genomic conflict ultimately disap- 
pears (i.e., the population reaches a balanced sex ratio). While an 
almost complete sweep of a strongly distorting allele appears un- 
likely, theoretical models have described situations under which 
such a pattern is predicted (Charlesworth and Hartl 1978; Carvalho 
and Vaz 1999; Hall 2004). 

In some D. simulans populations, the Dox gene shows evi- 
dence of a partial selective sweep (Kingan et al. 2010), but it is 
difficult to distinguish highly localized selective sweeps from 
random fluctuations in variability due to the bottleneck associated 
with the out-of-Africa expansion (Jensen et al. 2005). We fur-
ther scrutinized the genomic region around Dox using Pool-seq
data from African D. simulans (V Nolte and C Schlotterer, unpubl.) 
and did not note any pronounced trough in variability around the 
Dox region, suggesting that, at least in the African population 
sample, no evidence for a selective sweep comparable to the one in 
D. mauritiana could be detected (Supplemental Fig. S3). 

The Dox system was initially discovered in D. simulans 
(Dermitzakis et al. 2000), and the driver loci MDox and Dox have 
not yet been functionally analyzed in D. mauritiana. While an
analysis of the D. mauritiana alleles present at MDox, Dox, and Nmy 
is not possible from the Pool-seq data (see above), Tao et al. (2007b) 
inferred from sequence comparison that a functional suppressor 
allele at Nmy is present in D. mauritiana, suggesting the existence of 
a functional distorter (Tao et al. 2007a,b). No selective sweep could 
be detected in the Nmy gene region (data not shown), suggesting 
that no new allele at Nmy has swept through D. mauritiana. 

Given the dynamic evolution of repetitive structures in the 
Winters sex-ratio genes, it is possible that in D. mauritiana, a new 
driver allele at Dox evolved and caused the pronounced sweep 
signature, while in D. simulans, the signature of an older sweep has 
already been erased. Alternatively, we could speculate that it may 
be easier for an allele to sweep in D. mauritiana since D. mauritiana 
has almost no population structure (Nunes et al. 2010), while the 
cosmopolitan species D. simulans shows a higher level of pop- 
ulation differentiation (Hamblin and Veuille 1999). 

Theoretical studies predict that a beneficial allele will spread 
much faster in panmictic populations, whereas population sub- 
division and low migration rates lead to a delay in the fixation of 
a beneficial mutation (Barton 2000; Santiago and Caballero 2005; 
Kim and Maruki 2011). 

The importance of population structure for the detection of 
meiotic drive dynamics has also been highlighted in recent theo-
retical work. Hall (2004) suggested that the absence to date of
documented cycling behavior in natural Drosophila populations
may be the result of migration between subdivided populations
with different drive parameters. In contrast, in isolated, panmictic
populations that share the same drive dynamics, large fluctuations
in driver and suppressor frequencies can be seen. Since D. mauritiana shows no
population differentiation on Mauritius, we think that this could 
explain the difference with the other Drosophila species. 

It is apparent that more work is needed to characterize the 
driver, responder, and suppressor alleles in both species to shed 
further light onto the differences in evolutionary signatures ob- 
served between the two species. 

Methods 

D. mauritiana strains and Illumina sequencing 

We used the D. mauritiana isofemale strain MS17
(http://kyotofly.kit.ac.jp/cgi-bin/ehime/index.cgi, stock number E-18912)
to generate a D. mauritiana reference genome. Pool-seq data were
obtained from 107 D. mauritiana lines collected at different time
points and locations in Mauritius (Supplemental Table S2). Illu- 
mina libraries were generated following the instructions of the 
Illumina Paired-End Sample Preparation Kit and sequenced on a 
GAIIx. 



De novo assembly and annotation of a D. mauritiana reference 
genome 

To generate the D. mauritiana reference genome sequence, we 
initially performed a de novo assembly of Illumina reads using the 
software CLC Assembly Cell v. 3.1.0 beta2 (CLC Bio). In the second 
phase of the assembly procedure, we anchored de novo contigs on 
the reference genome of D. melanogaster r. 5.22 using the nucmer 
module in the MUMmer package v. 3.0 (Kurtz et al. 2004). The 
D. mauritiana chromosomes were built by overlapping or concat- 
enating contigs. The longest isoform of each D. melanogaster 
protein from FlyBase release 5.32 served as template for annota- 
tion. Each protein sequence was aligned to the D. mauritiana 



reference genome using exonerate v. 2.0 (Slater and Birney 2005). 
We generated four sets of gene annotations, using varying degrees 
of filtering criteria that are described in detail in Supplemental 
Methods. 



Divergence estimates and codon usage analysis 

We performed multiple alignments of the D. mauritiana, a 
D. simulans, the D. sechellia r.1.3, and the D. melanogaster r. 5.32 
genome sequences using MAUVE (Darling et al. 2010) and calcu- 
lated pairwise divergence Dxy between them using the PoPoolation 
package (Kofler et al. 2011). We used CAIcal v. 1.4 (Puigbò et al.
2008) to determine the Codon Adaptation Index (CAI), originally 
developed by Sharp and Li (1987). 
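
As an illustration of the Codon Adaptation Index, the following sketch computes the CAI of a coding sequence as the geometric mean of the relative adaptiveness values of its codons, with relative adaptiveness derived from a reference set of sequences. It is not the CAIcal implementation: the function names, the use of the standard genetic code, the pseudocount of 0.5 for unobserved codons, and the toy sequences are all assumptions made for this example.

```python
import math
from collections import Counter, defaultdict

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def codons(seq):
    """Split an in-frame coding sequence into complete codons."""
    seq = seq.upper()
    return [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]


def relative_adaptiveness(reference_seqs, pseudocount=0.5):
    """Relative adaptiveness w of each codon from a reference gene set.

    w = count of the codon / count of the most frequent synonymous codon.
    Stop codons and single-codon amino acids (Met, Trp) are excluded.
    """
    counts = Counter(c for s in reference_seqs for c in codons(s) if c in CODON_TABLE)
    synonymous = defaultdict(list)
    for codon, aa in CODON_TABLE.items():
        synonymous[aa].append(codon)
    w = {}
    for aa, family in synonymous.items():
        if aa == "*" or len(family) == 1:
            continue
        adjusted = {c: counts[c] if counts[c] > 0 else pseudocount for c in family}
        top = max(adjusted.values())
        for c in family:
            w[c] = adjusted[c] / top
    return w


def cai(seq, w):
    """Geometric mean of w over all scorable codons of seq."""
    logs = [math.log(w[c]) for c in codons(seq) if c in w]
    if not logs:
        raise ValueError("sequence contains no scorable codons")
    return math.exp(sum(logs) / len(logs))


# Toy reference set and query (hypothetical sequences).
weights = relative_adaptiveness(["GCTGCTGCTGCC"])
print(round(cai("GCTGCC", weights), 4))
```

In practice the reference set would consist of highly expressed genes of the species under study, and the choice of that set strongly influences the resulting values.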

Reference mapping and variability estimates in D. mauritiana 
Pool-seq data 

Paired-end reads of the pooled D. mauritiana sample were aligned 
to the MS17 draft genome (or D. melanogaster genome) using bwa
v. 0.5.8 (Li and Durbin 2009). Alignments were filtered for a min-
imum mapping quality of 20 and for properly paired reads using
SAMtools v. 0.1.9 (http://samtools.sourceforge.net/). Minimum
requirements for coverage and allele count used in SNP calling are
detailed in Supplemental Methods. Analyses of π and Tajima's D
were performed with the PoPoolation package (Kofler et al. 2011). 
To test for recurrent positive selection in the D. mauritiana line- 
age, we performed McDonald-Kreitman tests. Multiple align- 
ments of the coding sequence of each D. mauritiana gene with the 
orthologs of D. melanogaster r. 5.32 and D. yakuba r. 1.3 were 
generated using PRANK v. 100701 (Löytynoja and Goldman
2005). We combined the interspecific with the intraspecific 
alignments using custom Perl scripts and performed McDonald- 
Kreitman tests using the MK.pl script obtained from http:// 
www.dpgp.org/aholloway/Software.html (Holloway et al. 2007). 
We calculated false discovery rates (FDR) using the LBE package 
(Dalmasso et al. 2005) and performed an analysis of Gene On- 
tology enrichment with GOrilla (http://cbl-gorilla.cs.technion. 
ac.il/) (Eden et al. 2009). Details of all analyses are provided in 
Supplemental Methods. 
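
The window-based variability estimates can be illustrated with a simplified sketch that computes nucleotide diversity (π) in nonoverlapping windows from per-site allele counts. This is not the PoPoolation implementation: it uses the classical unbiased per-site heterozygosity and omits the corrections for pool size and minimum allele count that PoPoolation applies to Pool-seq data. The function names, the input format, and the minimum coverage of 10 are assumptions chosen for this example.

```python
from collections import defaultdict


def site_heterozygosity(allele_counts):
    """Unbiased heterozygosity at one site: n/(n-1) * (1 - sum(p_i^2))."""
    n = sum(allele_counts)
    if n < 2:
        raise ValueError("at least two reads are needed at a site")
    return n / (n - 1) * (1 - sum((c / n) ** 2 for c in allele_counts))


def windowed_pi(sites, window=10_000, min_cov=10):
    """Average per-site heterozygosity in nonoverlapping windows.

    sites: iterable of (position, allele_counts) with 1-based positions;
           monomorphic sites must be included so that pi is per site.
    Returns {window start position: pi}; sites below min_cov are skipped.
    """
    sums = defaultdict(float)
    n_sites = defaultdict(int)
    for pos, counts in sites:
        if sum(counts) < min_cov:
            continue
        idx = (pos - 1) // window
        sums[idx] += site_heterozygosity(counts)
        n_sites[idx] += 1
    return {idx * window + 1: sums[idx] / n_sites[idx] for idx in sorted(n_sites)}


# Hypothetical allele counts at three sites (not data from this study).
example_sites = [(1, [10, 10]), (2, [20, 0]), (10_001, [20, 0])]
for start, pi in windowed_pi(example_sites).items():
    print(start, round(pi, 4))
```

Dividing by the number of sufficiently covered sites, rather than by the window length, keeps windows with uneven coverage comparable.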

Statistical tests 

Statistical tests were performed using R version 2.11.1 (The R Core 
Team 2010) unless stated otherwise. 

Data access 

All Illumina short reads used in this study are available from the 
NCBI Sequence Read Archive (SRA) (http://www.ncbi.nlm.nih.gov/sra)
under the following accession numbers: the single D. mauritiana
reference strain MS17 under SRA058420, the D. mauritiana Pool-seq
data under SRA058664, the D. simulans reference strain Kib32
under SRA059282, and the African D. simulans Pool-seq data under
SRA059292. The D. mauritiana strain MS17 reference genome
and annotation are available at
http://www.popoolation.at/mauritiana_genome/index.html. A BAM file
containing D. mauritiana Pool-seq data is available at
http://www.popoolation.at/mauritiana_genome/index.html. A searchable,
user-friendly version of the D. mauritiana Pool-seq and the African
D. simulans Pool-seq data is available at
http://www.popoolation.at/pgt/dmau_browse.html and
http://www.popoolation.at/pgt/dsim_browse.html.